Following a discussion of various methods of 
evaluating the effectiveness of developmental programs, this paper 
presents a research design for the evaluation of a developmental math 
program. First, the paper examines the objectives and benefits of 
conducting formative and summative program evaluations. The paper 
then identifies the single group pre-test/post-test comparison as the 
most commonly used method of evaluating remedial programs, and 
identifies the biases to which the method is vulnerable (e.g., test 
administration, student attitude, instrument, and external learning 
biases). Next, methods of reducing the influence of some of these 
biases are suggested. The paper then focuses on the marginally 
remedial/marginally exempted comparison, the remediated/exempted 
comparison, norm-group comparison, cross-program comparison, and 
historical comparison methods. The paper then recommends that program 
evaluators be objective, informed, comprehensive, pragmatic, 
political, selective, and prepared to compromise. The last sections 
describe the methods of data collection and analysis, and the 
findings of the developmental math program evaluation. The project 
(1) investigated the number of students who completed the math 
courses, the number who took a more advanced math course after 
completing the developmental course, and marks in advanced math 
courses; and (2) compared results from classes in which math anxiety 
was recognized and treated with those in which it was not. (LAL) 
DATA COLLECTION AND DESCRIPTIVE STATISTICS 
FOR EFFECTIVE PROGRAM EVALUATION 

I am going to approach this topic in reverse order, looking at 
the possibilities and pitfalls of evaluation of developmental programs 
and then the data collection and descriptive statistics. 

First of all we must decide whether we are looking for a 
"comprehensive" or summative evaluation or, more simply, a formative one; 
one which will identify those elements in the instructional program 
which contribute to its effectiveness and those which need improvement. 
Studies like the latter are quick, simple, and informative. They may be 
designed to examine any aspect of a remedial program: from texts to 
methods of instruction; from testing procedures to exit criteria; from 
which of the students are learning the content to what content is being 
learned. Formative evaluations do not usually lend themselves to 
generalizations but they are invaluable in putting the program on the right 
track and seeing that it stays there. Summative evaluations, intended, 
as the objective indicates, to measure overall changes, may cover a 
number of areas such as: 

1. Appropriateness of Objectives: Are they actually the premise on 
which the program is based? Are some of them misguided or inappropriate? 

2. Appropriateness of Content to Program Objectives: Is drill in 
fundamental operations of arithmetic necessary in a world of inexpensive 
hand calculators? Is ability to figure compound interest a key to future 
success? 

3. Appropriateness of Placement Procedures: Whatever the basis 
(testing, high school records, interviews), what percent are underplaced 
and what percent overplaced? 



4. Effectiveness of Instruction: Are the students learning the 
remedial content and, if so, is the learning the result of remedial 
instruction or extraneous factors? 

5. Efficiency of Instruction: Can the same learning be provided 
for less money or more learning for the same? 

In practice most summative evaluations focus on number 4, 
Effectiveness of Instruction: whether or not, or to what degree, students 
are learning the course content and how well they are succeeding as a 
result in subsequent courses. Particularly important then are pre-program 
and post-program measures: what the student knew before and what the 
student knew after participation in the program. 

The single group pre-test-post-test comparison is probably the 
design most commonly used in the evaluation of remedial programs. It is 
certainly the easiest to implement. You have a group of students who 
start and finish the course and you compare their knowledge at the end 
with their knowledge at the beginning. Unfortunately, this design is 
often of least value. Even if post-program scores are significantly 
higher than pre-program scores, and they usually are, this change cannot 
be automatically attributed to effectiveness of the program. The single 
group pre-test-post-test comparison is particularly vulnerable to a host 
of extraneous factors known as biases, which distort results and cloud 
interpretations. They may come from the peculiarities of the student reaction 
to tests and testing procedures, or from learning that takes place but 
not as a result of the remedial program. These are the biases: 

1. Test Administration Bias: This arises if the administration of the 
pretest differs significantly from that of the posttest. The pretest may be 
a part of a large battery of tests given in a poorly lit auditorium. The 
posttest may be given in the small, comfortable classroom by a sympathetic 
teacher who answers leading questions, allows extra time or otherwise 
contributes to exaggerated gains. 

2. Student Attitude Bias: Students may underestimate the importance 
of the pretest and do less than their best work, so that program effectiveness 
is overestimated, or be over-anxious on the posttest (which may be the final 
exam) and perform poorly, so that program effectiveness is underestimated. 

3. Teaching to the Test: Usually instructors are familiar 
with the content of the posttest; therefore, unconsciously or consciously, they 
may stress topics included, one type of algebra problem over another, for 
example. The outcome, then, is an artificial increase in scores. 

4. Practice Effect: The mere experience of taking the pretest 
may prepare students to do better later on; they have had experience 
with the particular format, the use of the answer sheet, allocation of 
time, etc. The outcome, again, may be an artificially high post-test 
score. 

5. Instrument Bias: Pretest and posttest must be valid (content 
and minimum proficiency level must be appropriate) and reliable 
(results must be consistent). 

6. Hawthorne Effect: Student performance is likely to improve 
simply because they are receiving special attention. These gains, which 
are genuine for the experimental group, may not be sustained for 
subsequent populations. 

7. Drop-outs: The bottom of the class is sifted out rather than 
taught, and post-test scores for those dropouts are seldom included in the 
statistical analysis. 

8. Regression Toward the Mean: Those who initially scored at the 
extremes will, when retested, tend to score toward the middle. 

9. External Learning Bias: Students often improve their basic 
skills for reasons having little or nothing to do with remedial 
instruction, such as the sheer excitement of being in college, particularly 
for the non-traditional students. 

10. History Bias: This bias concerns the possible effect of 
accidental or unpredictable external events on the program under 
evaluation: a one-year grant resulting in smaller classes, or a strike 
or a crippling snow storm that reduces class time. 

The pre-test-post-test is not necessarily hopeless if we recognize 
these biases and take steps to reduce their influences. For example: 

Test Administration Bias: Set the pretest in a more relaxed atmosphere 
and avoid having classroom teachers administer post tests. 

Student Attitude Bias: Persuade students of the importance of doing 
well on the placement exam, or give pre and post tests separately, neither 
as placement nor as final exam. 

Teaching to the Test: Do not allow instructors to see the test. From a 
bank of tests, randomly choose the one to be used. 

Practice Effect: Use alternate forms of the test (cover the same basic 
material but in different order). 

Instrument Bias: Use tests of established validity. 

Hawthorne Effect: Conceal from students the fact that the program is being 
evaluated. 

Dropout Bias: Use pre-test scores only for those students 
who also take the post test. Do a separate analysis of the drop-outs. 
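
The dropout remedy in the list above can be sketched in a few lines. This is only an illustration under stated assumptions: the student records, field names, and scores below are invented for the example and are not data from any program.

```python
from statistics import mean

# Hypothetical records (invented): post is None for students who dropped out.
students = [
    {"id": 1, "pre": 40, "post": 65},
    {"id": 2, "pre": 55, "post": 70},
    {"id": 3, "pre": 30, "post": None},
    {"id": 4, "pre": 35, "post": None},
    {"id": 5, "pre": 50, "post": 60},
]

completers = [s for s in students if s["post"] is not None]
dropouts = [s for s in students if s["post"] is None]

# Matched gain: only students who took both tests contribute.
matched_gain = mean(s["post"] - s["pre"] for s in completers)

# Naive gain: everyone's pretest against the completers' posttest.
# Losing the low scorers inflates the apparent improvement.
naive_gain = mean(s["post"] for s in completers) - mean(s["pre"] for s in students)

# Separate analysis of the dropouts: where did they start?
dropout_pre_mean = mean(s["pre"] for s in dropouts)
```

With these invented numbers the naive comparison overstates the gain, because the students who left had the lowest pretest scores.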

To compensate for these biases an alternative is to use a control 
group: a group of students initially comparable to those entering the 
program who receive no remedial instruction or an alternate form of 
remedial instruction. Here we can assume biases affect both groups 
equally and can therefore be disregarded. 

1. Looking at the control group who receive no remediation at all, we have 
the Remediated-Unremediated Comparison: The remedial population is 
divided randomly into 2 initially equivalent groups. The average pre-
program measures are compared to check initial equivalence. The extent 
to which post-program measures differ is the gauge of program effectiveness. 
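
As a rough sketch of this design, the random split and the two comparisons might look like the following. The function names, the seed, and the pre-program scores are assumptions made up for the illustration.

```python
import random
from statistics import mean

def split_randomly(student_ids, seed=0):
    """Shuffle the ids and cut the list in half (seed is arbitrary)."""
    rng = random.Random(seed)
    ids = list(student_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    return ids[:half], ids[half:]

def group_difference(scores, group_a, group_b):
    """Mean score of group_a minus mean score of group_b."""
    return mean(scores[i] for i in group_a) - mean(scores[i] for i in group_b)

# Hypothetical pre-program scores keyed by student id (invented values).
pre_scores = {i: 30 + (i * 7) % 25 for i in range(40)}

remediated, unremediated = split_randomly(pre_scores)

# Check of initial equivalence: this gap should be small.
pre_gap = group_difference(pre_scores, remediated, unremediated)

# After the program, the same function applied to post-program scores
# gives the gauge of effectiveness described in the text.
```

If the pre-program gap turns out to be large, the split can be redone before any instruction begins.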

The major disadvantage of this comparison is that the deliberate 
withholding of remediation is ethically questionable. However, until the 
effectiveness of a remedial program is clearly demonstrated the ethical 
questions may be somewhat premature; the program may in fact be a 
downright waste of time. "Although no evidence exists regarding effectiveness . . . 
colleges offering remedial programs have been generally reluctant to 
randomly exempt from remedial work a portion of those students identified 
as being in need of remediation. This reluctance has prevented the conduct 
of experimental research which might enable educators to determine which 
remedial techniques are effective, for whom, and under what conditions." 

2. The Marginally Remedial, Marginally Exempted Comparison: The 
marginally remedial group are those who narrowly fail the pre-test and 
receive the remediation; the marginally exempt are those who narrowly 
pass. The assumption is that the two groups are so close as to be 
considered equivalent. Again the difference in post-program measures is 
the indication of program effectiveness. This design avoids the moral 
dilemma of withholding remediation, but it measures the effectiveness of 
the program only with the best of the remedial ones, and we can be sure 
of a measure of success only if remediated ones surpass exempted ones in 
the post test. If the exempted surpass the remedial it is extremely 
difficult to decide whether the program has some value or whether it is 
altogether useless, and thus the results are inequitable. 

These designs have measured remedial effectiveness, not evaluated it. 

The following are designs which evaluate learning: 

1. Remediated-Exempted Comparison: Here the pre-program measure 
is usually the original placement score, and the post-program measure 
a long-term one: the GPA or performance in college level courses. 
The comparison is between post-program measures for two groups. 
If the remedial group surpasses or even matches the exempted one we have 
strong evidence the program is successful. However, if the remediated 
group continues to lag behind no conclusion can be made with certainty, 
and so this comparison is inequitable also. 

2. Norm-Group Comparison: Pre-program and post-program measures 
consist of scores on standardized tests. The improvement of the local 
remedial population is compared with the corresponding national population 
on which the test was normed. This design is relatively simple, the 
comparison is based on information readily available, and we can draw 
conclusions about the value of the program whether local gains are higher 
or lower than those of the norm group. One negative side is the 
disparities in the make-up of the norm group and the local population: 
age, sex, socio-economic status. 

3. Cross Program Comparison: This comparison employs as the standard 
of success the achievement of a comparable remedial program, typically 
at another college. It is assumed that the two remedial groups are initially 
equivalent, that the two programs have comparable objectives, content, 
and placement procedures, and that the colleges have agreed on common 
pre-program and post-program measures. The major advantage here is that 
the results have formative as well as summative value. It not only 
compares the general effectiveness but shows places where the weaker program 
should be modified. The cross-program comparison is seldom used, however, 
because of the difficulty of locating matching populations, objectives, 
contents, and placement procedures. There may also be problems of 
implementation if the staffs of the two colleges, feeling themselves 
in competition, are reluctant to cooperate. 

4. Historical Comparison: The study is conducted at a single college, 
the comparison being between different semesters. This is particularly 
valuable when there have been deliberate changes either in the remedial 
program or in the college environment. 

Whatever the evaluative design, it must be closely related to the 
objectives of the remedial program and its merits must be gauged to answer 
the following questions: 

1. Is the design relatively free from the effect of biases? 

2. Is it equitable: Does it provide evidence equally well of 
success or failure? 

3. Is it comprehensive: Does the sample reflect the entire remedial 
population? 

Hecht and Akst conclude their chapter on program evaluation with the 
following recommendations. 

1. Be Objective: Objectivity is gauged by the extent to which the study 
is based on concrete, appropriate data. Specific criteria for success or 
failure must be agreed upon in advance. 

2. Be Informed: Familiarity with the range of design options is critical 
in planning the study. (See books from the U.S. Office of Education: Horst, 
Tallmadge and Wood 1975; Tallmadge & Horst, 1976.) 

3. Be Comprehensive: Assess not only the overall effectiveness of remedial 
instruction, but also cost effectiveness of program, reliability of 
placement procedures, ... In other words, use formative procedures, particularly 
early in the program. 

4. Be Pragmatic: Do only what is possible, taking into account local 
limitations of resources, equipment, staffing, and time. The ambitious 
study unfinished is worth much less than the modest one completed. 

5. Be Political: In planning and implementing your study be alert 
to the policy of the college and the misgivings of the staff. Keep all 
parties informed of your procedures and avoid strategies which may 
discourage cooperation. 

6. Be Selective: Although the study should include all significant 
areas, choose the type of data to be collected with care. Great masses of 
data collected without purpose or direction are of no use. Identify precisely 
the necessary data and draw up a collection schedule so that data will not 
be lost. In the final report, highlight major results so that they will not 
be lost in a sea of peripheral information. 

7. Be Prepared to Compromise: If a particular direction seems to be 
treading on toes, go another way. Be prepared to compromise between the 
attainable and the ideal, at the same time gauging the extent to which such 
shortcomings may result in misleading conclusions. 

Now, with all the things in mind that I could do and have done wrong, 
I am going to talk to you about a particular study and then we will look 
at studies you want to do. 

As anxiety reduction techniques found their way into developmental 
math classrooms across the country, serious questions were being raised as 
to the effect this was having on the mathematics being taught in these 
classrooms. "Is the mathematics in these classes being 'watered down'?" "Is the 
subject matter being replaced by psychological procedures?" "Are the students being 
'spoon-fed'?" "How do they survive when they get into a 'real' math class?" 

In an attempt to answer these questions I set up the following research 
project to look into 3 areas: 

1. Numbers who successfully completed our developmental classes. 

2. Numbers who took a more advanced math course after completion of 
our developmental one. 

3. Marks in these more advanced courses. 

I compared the results from classes in which math anxiety was recognized 
and treated with those in which it was not. 

I am a member of a 5-person mathematics department in a 14 year old 
community college in Connecticut. All of us share an enthusiasm for 
mathematics and a concern for our students, but I am the only one who feels 
the need for students to feel comfortable with themselves in a math class 
in order for real learning to take place and, therefore, the only one 
who actually uses anxiety reduction techniques in the classroom. 

Our developmental course, Math 99, reviews basic arithmetic and covers 
as much algebra as seems individually feasible. We have no set syllabus, 
nor do we use the same book, and we give different final examinations. The 
course carries 3 credits towards graduation and, if the student transfers after 
graduation, he/she receives 3 general elective credits at our state colleges 
and university. All full-time students are required to take a math 
placement test, one which we have developed. Students may also place themselves 
in the course, and many older students do. 

For my study I took all sections of our developmental course from 
Spring semester, 1977, through Fall 1980: 1074 students. I had a 3 X 5 
file card for each student on which I recorded: name, M or F, semester of Math 99, 
and other quantitative courses taken afterwards, with marks and dates of 
taking these courses. (I wanted age but was not able to get it.) At this 
point I made the decision as to which statistic was to be used, and I 
chose the significance of difference between percentages as interpreted by 
the normal curve, because the data is translated into scaled data rather than 
category data and it is a more powerful statistic than those dealing with nominal 
data. However, many people have asked about χ², which seems to be more 
easily understood, so to deal with my first set of comparisons I will use both. 

H0: Number of completions same for both groups. 
Ha: Number of completions greater for group with anxiety reduction. 

First using χ²: 

at α = .01, D.F. = 1, critical χ² = 6.6 

                              Completed   Did not complete   Total 
With anxiety reduction           255             85            340 
Without anxiety reduction        464            270            734 
Total                            719            355           1074 

To get "expected": for example, 227.6 = (340 X 719) / 1074 

"Did not complete" was sum of Failures, Withdrawals, and Incompletes. 

χ² = 3.30 + 6.68 + 1.5 + 3.09 = 14.57 

Therefore H0 rejected. 
The overall picture from χ² shows that more completed with anxiety 
reduction and more failed to complete without. 
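
The χ² arithmetic can be checked with a short script. This is a sketch under an assumption: the four observed cells are reconstructed from the totals quoted in the text (1074 students, 340 in anxiety-reduction classes, 719 completions, a 75% completion rate in the anxiety-reduction group), since the printed table is not fully legible.

```python
# Observed counts, reconstructed from the quoted totals (an assumption).
observed = [
    [255, 85],   # with anxiety reduction: completed, did not complete
    [464, 270],  # without anxiety reduction: completed, did not complete
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
expected = []
for i, row in enumerate(observed):
    expected_row = []
    for j, obs in enumerate(row):
        # Expected count = row total x column total / grand total.
        exp = row_totals[i] * col_totals[j] / grand_total
        expected_row.append(exp)
        chi_square += (obs - exp) ** 2 / exp
    expected.append(expected_row)

CRITICAL_01_DF1 = 6.635  # chi-square critical value, alpha = .01, 1 d.f.
reject_null = chi_square > CRITICAL_01_DF1
```

Carried at full precision the statistic comes to about 14.58, in line with the hand total of 14.57 once the rounding of the individual terms is allowed for.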

Checking the significance of differences between particular percentages is 
more exact. Using the same data I set up the hypothesis that there was no 
significant difference between the percentages of those in classes who completed 
the course with anxiety reduction and those without. 

H0: % completions with anxiety reduction = % completions without 

Ha: % with > % without (at .01 level because I expected a large difference) 

z at .01 = 2.58 

p = [734 (.642) + 340 (.75)] / (734 + 340) = .676 

q = 1 - .676 = .324 

σ = sqrt[ (.676) (.324) (1/734 + 1/340) ] = .031 

z = (.75 - .632) / .031 = 3.8 

Therefore H0 rejected. (Probability = .999 or 99.9%. It is very unlikely that 
this could happen by sheer chance. H0 is rejected and the difference is significant 
at the .001 level or .1%.) 







Second: Using z scores 

H0: % taking another math course after anxiety reduction Math 99 = % taking 
another after regular Math 99. 

Ha: % taking another math course after anxiety reduction Math 99 ≠ % taking 
another after regular Math 99. (Because of reduction or filtering out of students 
in regular Math 99's I did not expect much difference.) α = .05; z at .05 = 1.96 

                 Total finishing 99   Took another course     % 
Without M.A.            354                  166            46.89 
With M.A.               164                   93            56.7 

p = [(.567) (164) + (.469) (360)] / (360 + 164) = .496 

q = .504 

σ = sqrt[ (.24998) (.00278 + .00610) ] = sqrt(.0022198) = .047 

z = (.567 - .469) / .047 = 2.085 

Therefore H0 rejected. Again, not likely that this would happen by sheer 
chance. 
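
Both z tests follow the same pattern, which can be sketched as one small function. The function name is hypothetical. The completion counts (255 of 340 and 464 of 734) are an assumption reconstructed from the quoted rates and totals; the follow-up counts (93 of 164 and 166 of 354) are taken from the table above.

```python
from math import sqrt

def two_proportion_z(x1, n1, x2, n2):
    """z for the difference of two proportions, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / std_error

# Completion of Math 99: with anxiety reduction vs. without.
z_completion = two_proportion_z(255, 340, 464, 734)

# Taking another math course: with anxiety reduction vs. without.
z_next_course = two_proportion_z(93, 164, 166, 354)
```

Recomputing from the raw counts gives slightly different intermediate values than the hand calculations (a pooled proportion of about .669 rather than .676 in the first test, and 354 rather than 360 finishers in the second), but z stays near 3.8 and 2.08, so neither conclusion changes.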
3. Comparison of marks in following math courses: I recorded the marks as 
A = 4; B = 3; C = 2; D = 1; F, W, I = 0 and checked first the F ratio for 
homogeneity of variances, to show that they were samples from the same 
population, and then proceeded to test the significance of the difference 
between two means. 



Mark        X     f 
A           4    13 
B           3    18 
C           2    19 
D           1    10 
F, I, W     0    25 
            N = 85 

Mean X = 154 / 85 = 1.81 

S = 1.4430;  S² = 2.0822 

X     f 
4    34 
3    39 
2    37 
1    11 
0    52 
N = 173 

Mean X = (sum of fX) / 173 = 2.01 

S = 1.4879;  S² = 2.21374 

F = 2.21374 / 2.0822 = 1.06 

Critical F at .05 = 1.58 

Not significant, therefore homogeneous. 



σ = sqrt[ ((173) (2.21374) + (85) (2.0822)) / (173 + 85 - 2) X (173 + 85) / ((173) (85)) ] = .196 

z = (2.01 - 1.81) / .196 = 1.02 

Means not significantly different. 
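
The marks comparison can be retraced roughly as follows. Two assumptions are made here: the smaller group's statistics are recomputed from its frequency table, while the larger group's mean and variance are taken as printed; and the standard error uses a pooled-variance formula chosen because it reproduces the printed .196, since the original formula is not legible.

```python
from math import sqrt

def mean_and_variance(freq):
    """Mean and (population) variance from a {mark value: frequency} table."""
    n = sum(freq.values())
    mean = sum(x * f for x, f in freq.items()) / n
    variance = sum(f * x * x for x, f in freq.items()) / n - mean ** 2
    return n, mean, variance

# Smaller group: frequencies for A=4, B=3, C=2, D=1, F/I/W=0.
n1, mean1, var1 = mean_and_variance({4: 13, 3: 18, 2: 19, 1: 10, 0: 25})

# Larger group: summary figures as printed in the paper.
n2, mean2, var2 = 173, 2.01, 2.21374

# F ratio for homogeneity of variances (larger variance on top).
f_ratio = max(var1, var2) / min(var1, var2)

# Pooled standard error of the difference between means (assumed formula).
pooled_var = (n1 * var1 + n2 * var2) / (n1 + n2 - 2)
std_error = sqrt(pooled_var * (n1 + n2) / (n1 * n2))

z = (mean2 - mean1) / std_error
```

The z value of about 1.0 falls well short of 1.96, which is the basis for calling the two means not significantly different.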




In conclusion then, in my study of those students who have been helped to 
use anxiety reduction techniques in their developmental classes as compared 
to those who have not, a significantly larger percent finished the class, and 
of those who finished significantly more took another math course and showed no 
significant difference in ability to cope with traditionally taught higher 
level mathematics courses as indicated by their marks. 
That is, they did as well as the much smaller and more select group who had 
been taught in the traditional fashion. 